System and method for state estimation in a noisy machine-learning environment

ABSTRACT

A system and method for estimating a system state. The method includes constructing a first estimate of a system state at a first time including a first covariance matrix describing an accuracy of the first estimate. A second estimate of the state is constructed at a second time, after the first time, including a second covariance matrix. A value of a characteristic of the system state is measured at the second time and the second estimate of the system state and the second covariance matrix are adjusted based on the value of the characteristic. A third estimate of the system state is constructed at a third time, before the second time, including a third covariance matrix describing an accuracy of the third estimate. A fourth estimate of the system state is constructed at a fourth time being after the second time.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Patent Application No. 62/756,044, entitled “Hybrid AI,” filed Nov. 5, 2018, which is incorporated herein by reference.

This application is related to U.S. application Ser. No. 15/611,476 entitled “PREDICTIVE AND PRESCRIPTIVE ANALYTICS FOR SYSTEMS UNDER VARIABLE OPERATIONS,” filed Jun. 1, 2017, which is incorporated herein by reference.

This application is related to U.S. Provisional Application No. 62/627,644 entitled “DIGITAL TWINS, PAIRS, AND PLURALITIES,” filed Feb. 7, 2018, converted to U.S. application Ser. No. 16/270,338 entitled “SYSTEM AND METHOD THAT CHARACTERIZES AN OBJECT EMPLOYING VIRTUAL REPRESENTATIONS THEREOF,” filed Feb. 7, 2019, which are incorporated herein by reference.

This application is related to U.S. application Ser. No. ______ (Attorney Docket No. INC-031B), entitled “SYSTEM AND METHOD FOR ADAPTIVE OPTIMIZATION,” filed Nov. 5, 2019, U.S. application Ser. No. ______ (Attorney Docket No. INC-031C), entitled “SYSTEM AND METHOD FOR CONSTRUCTING A MATHEMATICAL MODEL OF A SYSTEM IN AN ARTIFICIAL INTELLIGENCE ENVIRONMENT,” filed Nov. 5, 2019, and U.S. application Ser. No. ______ (Attorney Docket No. INC-031D), entitled “SYSTEM AND METHOD FOR VIGOROUS ARTIFICIAL INTELLIGENCE,” filed Nov. 5, 2019, which are incorporated herein by reference.

RELATED REFERENCES

Each of the references cited below is incorporated herein by reference.

U.S. Patents

Patent Number    Issue Date      Patentee
10,068,170       Sep. 4, 2018    Golovashkin, et al.

U.S. Patent Application Publications

Publication Number    Kind Code    Publication Date    Applicant
20190036639           A1           Jan. 31, 2019       Huang; Yan; et al.

Nonpatent Literature Documents

-   Makridakis, S. et al., “Statistical and Machine Learning Forecasting Methods: Concerns and Ways Forward” (2018)
-   Box, G. E. P. et al., “Time Series Analysis: Forecasting and Control” (2016)
-   Hyndman, R. J. and Athanasopoulos, G., “Forecasting: Principles and Practice” (2014)
-   Brown, R. G. and Hwang, P. Y. C., “Introduction to Random Signals and Applied Kalman Filtering with MATLAB Exercises” (2012)
-   Grewal, M. S. and Andrews, A. P., “Kalman Filtering: Theory and Practice Using MATLAB” (2008)
-   Jategaonkar, R., “Flight Vehicle System Identification: A Time Domain Methodology” (2006)
-   Simon, D., “Optimal State Estimation: Kalman, H-infinity, and Nonlinear Approaches” (2006)
-   Zarchan, P., “Fundamentals of Kalman Filtering: A Practical Approach” (2005)
-   Desai, U. B. et al., “Discrete-Time Complementary Models and Smoothing Algorithms: The Correlated Noise Case” (1983)
-   Van Loan, C. F., “Computing Integrals Involving the Matrix Exponential” (1978)
-   Gelb, A., “Applied Optimal Estimation” (1974)
-   Fraser, D. C. and Potter, J. E., “The Optimum Linear Smoother as a Combination of Two Linear Filters” (1969)
-   Meditch, J. S., “Stochastic Optimal Linear Estimation and Control” (1969)
-   Rauch, H. E. et al., “Maximum Likelihood Estimates of Linear Dynamic Systems” (1965)
-   Kalman, R. E., “A New Approach to Linear Filtering and Prediction Problems” (1960)

TECHNICAL FIELD

The present disclosure is directed, in general, to state tracking systems and, more specifically, to a system and method for estimating the state of a system in a noisy measurement environment.

BACKGROUND

Niels Bohr is often quoted as saying, “Prediction is very difficult, especially about the future.” Prediction is the practice of extracting information from sets of data to identify patterns and predict future behavior. The Institute for Operations Research and the Management Sciences (INFORMS) defines several types of prediction which have been expanded here for clarity:

-   Descriptive Analytics, in which enormous amounts of historical (big) data are used for describing patterns extracted solely from the data.
-   Predictive Analytics, where, in addition to descriptive analytics, subject matter expert (SME) knowledge is included to capture attributes not reflected in the big data by itself.
-   Decision Analytics, where, in addition to predictive analytics, influence diagrams (mathematical networks or models) are architected to address decision-making strategy.
-   Prescriptive Analytics, where, in addition to decision analytics, advanced mathematical techniques (e.g., optimization) are leveraged to predict missing data.

The techniques of prescriptive analytics include data modeling/mining, artificial intelligence (AI), supervised or unsupervised machine learning, and deep learning. The focus of the present disclosure is on machine and deep learning, with the intent to train a network or characterize a model for prediction about the future as new data is assimilated.

Table 1 (below) provides a sampling of the state of the art of machine learning methods, where each method has its own set of implementation requirements; it was presented by Anais Dotis-Georgiou at the Big Data and Artificial Intelligence Conference in 2019 at Addison, Tex.

TABLE 1

Regression               Classification           Soft Clustering     Hard Clustering
Ensemble methods         Decision trees           Fuzzy-C means       Hierarchical clustering
Gaussian process         Discriminant analysis    Gaussian mixture    K-means
General linear model     K-nearest neighbor                           K-medoids
Linear regression        Logistic regression                          Self-organizing maps
Nonlinear regression     naïve Bayes
Regression tree          Neural nets
Support vector machine   Support vector machine

In the case of supervised learning, the designer is required to manually select features, choose the classifier method, and tune the hyperparameters. In the case of unsupervised learning, some algorithms (e.g., k-means, k-medoid, and fuzzy c-means) require the number of clusters to be selected a priori; principal component analysis requires the data to be scaled, assumes the data is orthogonal, and results in linear correlation; nonnegative matrix factorization requires normalization of the data; and factor analysis is subject to interpretation.

Deep learning brings with it its own set of demands. Enormous computing power through high performance graphics processing units (GPUs) is needed to process big data, on the order of 10⁵ to 10⁶ points. Also, the data must be numerically tagged. Furthermore, it takes a long time to train a model. In the end, because of the depth of complexity, it's virtually impossible to understand how conclusions were reached.

The artificial neural network (ANN) architecture supporting machine/deep learning is supposedly inspired by the biological nervous system. The model learns through a process called back propagation, which is an iterative gradient method to reduce the error between the input and output data. But humans do not back-propagate when learning, so the analogy is weak in that regard. Other drawbacks include the following:

-   ANNs are shallow. While little innate knowledge is required on behalf of the practitioner, architectural selection becomes an exercise in numerical investigation.
-   Arbitrary numbers of hidden layers and nodes comprise the depth of deep learning. The practitioner is often unable to explain why one architecture is used over another, not to mention the practitioner has no control or influence over what is being learned.
-   ANNs are greedy and brittle. Big data (and big computing) is required to train/test the model, which often breaks when presented with new data.
-   ANNs are opaque. There is generally a lack of transparency due to the difficulty in understanding the connection between inputs and outputs, especially for deep learning. This unknown opaqueness leaves the practitioner wondering if the architecture can be trusted.

FIG. 1 reveals that machine learning forecasting performance is worse than that of statistical methods, when comparing the vertical axis, which represents the symmetric mean absolute percentage error of the various methods. FIG. 1 shows exponential smoothing (Error-Trend-Seasonality, “ETS”) as being superior to all others from a study. Noticeably absent from the field of comparison are optimal estimators, e.g., linear quadratic estimators, commonly called Kalman filters. It is believed that the prior art approaches have not included applying optimal estimation techniques to predictive or prescriptive analytics, let alone combining all three techniques: filtering, smoothing, and predicting. This is reasonable since the Kalman filter (without smoothing or predicting), and its variants, is typically used for state estimation in many control system applications.
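The error measure on the vertical axis of FIG. 1 is not written out in the text. As a minimal sketch, assuming the definition commonly used in forecasting competitions (the mean of 2|F − A| / (|A| + |F|), expressed in percent), it could be computed as follows. The function name `smape` and the handling of zero-valued pairs are assumptions made for this illustration.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (assumed definition)."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Assumption: a pair with a == f == 0 contributes zero error.
        if denom > 0.0:
            total += 2.0 * abs(f - a) / denom
    return 100.0 * total / len(actual)


# Two observations, each forecast off by roughly ten percent.
error_percent = smape([100.0, 200.0], [110.0, 180.0])
```

A lower value indicates a better forecast, which matches the way FIG. 1 is read in the text.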

A system is needed which improves upon machine learning and statistical methods such that the practitioner can perform real-time predictive and prescriptive analytics. The system should avoid the pitfalls of artificial neural networks with their arbitrary hidden layers, iterative feature and method selection, and hyperparameter tuning. Furthermore, the system should not require enormous computing power. Preferably, such a system will overcome state estimation challenges in a noisy measurement environment.

SUMMARY

Deficiencies of the prior art are generally solved or avoided, and technical advantages are generally achieved, by advantageous embodiments of the present disclosure of a system and method for estimating a state of a system. The method is characterized by constructing a first estimate of a state of a system at a first time including a first covariance matrix describing an accuracy of the first estimate. A second estimate of the state of the system is constructed at a second time, after the first time, including a second covariance matrix describing an accuracy of the second estimate and employing a dynamic model of the state of the system. A value of a characteristic of the state of the system is measured at the second time and the second estimate of the state of the system and the second covariance matrix are adjusted based on the value of the characteristic. A third estimate of the state of the system is constructed at a third time, before the second time, including a third covariance matrix describing an accuracy of the third estimate and employing the dynamic model of the state of the system. A fourth estimate of the state of the system is constructed at a fourth time, after the second time, from the second estimate.

The foregoing has outlined rather broadly the features and technical advantages of the present disclosure in order that the detailed description of the disclosure that follows may be better understood. Additional features and advantages of the disclosed embodiments will be described hereinafter, which form the subject matter of the claims. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures or processes for carrying out the same purposes of the present disclosure, and that such equivalent constructions do not depart from the spirit and scope of the disclosure as set forth in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present disclosure, reference is now made to the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 illustrates a graphical representation comparing forecasting performance (Symmetric Mean Absolute Percentage Error) of machine learning and statistical methods for known machine learning processes;

FIG. 2 illustrates a process for smoothing over fixed time intervals;

FIG. 3 illustrates a graphical representation of filtering, smoothing, and prediction performance for analytics;

FIGS. 4A and 4B illustrate performance examples of temperature tracking and vibration filtering in a noisy measurement environment;

FIG. 5 illustrates a flow diagram of an embodiment of a method of estimating the state of a system; and,

FIG. 6 illustrates a block diagram of an embodiment of an apparatus for estimating the state of a system.

Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated and, in the interest of brevity, may not be described after the first instance.

DETAILED DESCRIPTION

The making and using of exemplary embodiments of the disclosed invention are discussed in detail below. It should be appreciated, however, that the general embodiments are provided to illustrate the inventive concepts that can be embodied in a wide variety of specific contexts, and the specific embodiments are merely illustrative of specific ways to make and use the systems, subsystems, and modules for estimating the state of a system in a real-time, noisy measurement, machine-learning environment. While the principles will be described in the environment of a linear system in a real-time machine-learning environment, any environment such as a nonlinear system, or a non-real-time machine-learning environment, is within the broad scope of the disclosed principles and claims.

Intelligent prediction is a system introduced herein that uniquely combines the three forms of optimal estimation (filtering, smoothing, and predicting) to provide utility for predictive and prescriptive analytics with applications to real-time sensor data. The results of this systematic approach outperform the best of the current statistical methods and, as such, outperform machine learning methods.

To perform predictive and prescriptive analytics in real-time and avoid the pitfalls of machine learning which utilizes artificial neural networks, the system architecture introduced herein is based on combining three types of estimation: filtering, smoothing, and predicting, which is an approach new to forecasting, and which it is believed has not been previously considered with regard to machine learning, as illustrated in Table 1.

FIG. 1, to which reference is now made, illustrates on the vertical axis a symmetric mean absolute percentage error for forecasting performance of machine learning and statistical methods for known machine learning processes. The first part of the combined three-part system is an implementation of a discrete-time Kalman filter. The filtering form of optimal estimation is when an estimate coincides with the last measurement point.

With the following definitions:

$x_k$ = $(n \times 1)$, state vector at time $t_k$

$\phi_k$ = $(n \times n)$, state transition matrix

$w_k$ = $(n \times 1)$, process white noise, $w_k \sim (0, Q_k)$

$Q_k$ is the covariance of the process noise

$z_k$ = $(m \times 1)$, measurement vector at time $t_k$

$H_k$ = $(m \times n)$, measurement matrix

$v_k$ = $(m \times 1)$, measurement white noise, $v_k \sim (0, R_k)$

$R_k$ is the covariance of the measurement noise,

the dynamic process is described by

$$x_{k+1} = \phi_k x_k + w_k,$$

measurements are described by

$$z_k = H_k x_k + v_k,$$

and initial conditions given by

$$\hat{x}_0^- = E[x_0]$$

$$P_0^- = E\left[(x_0 - \hat{x}_0^-)(x_0 - \hat{x}_0^-)^T\right].$$

The discrete-time Kalman filter recursive equations are given by

$$
\begin{aligned}
K_k &= P_k^- H_k^T \left(H_k P_k^- H_k^T + R_k\right)^{-1} && \text{Filtering gain} \\
\hat{x}_k &= \hat{x}_k^- + K_k \left(z_k - H_k \hat{x}_k^-\right) && \text{State measurement estimate} \\
P_k &= (I - K_k H_k) P_k^- (I - K_k H_k)^T + K_k R_k K_k^T && \text{State measurement covariance} \\
\hat{x}_{k+1}^- &= \phi_k \hat{x}_k && \text{State time estimate at a next time step } t_{k+1} \\
P_{k+1}^- &= \phi_k P_k \phi_k^T + Q_k && \text{State time covariance (also at next time step)}
\end{aligned}
$$
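The five recursive equations can be sketched in code as follows. This is an illustrative NumPy version, not the disclosed implementation; the function name `kalman_step` and the argument names are assumptions, and the state and measurement are represented as one-dimensional arrays.

```python
import numpy as np


def kalman_step(x_prior, P_prior, z, phi, H, Q, R):
    """One cycle of the five filter equations.

    Returns the a posteriori estimate and covariance at step k and the
    a priori estimate and covariance propagated to step k+1.
    """
    # Filtering gain
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
    # State measurement estimate
    x_post = x_prior + K @ (z - H @ x_prior)
    # State measurement covariance (symmetric "Joseph" form, as in the text)
    I_KH = np.eye(len(x_prior)) - K @ H
    P_post = I_KH @ P_prior @ I_KH.T + K @ R @ K.T
    # State time estimate and covariance at the next time step
    x_next = phi @ x_post
    P_next = phi @ P_post @ phi.T + Q
    return x_post, P_post, x_next, P_next


# Scalar example: a constant state observed once with unit-variance noise.
one = np.array([[1.0]])
x_post, P_post, x_next, P_next = kalman_step(
    x_prior=np.array([0.0]), P_prior=one, z=np.array([2.0]),
    phi=one, H=one, Q=np.array([[0.0]]), R=one,
)
# Here the gain is 0.5, so x_post is [1.0] and P_post is [[0.5]].
```

Calling `kalman_step` in a loop, feeding `x_next` and `P_next` back in as the next prior, gives the recursion described above.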

It is worth mentioning that C. F. Van Loan's method is employed to compute $\phi_k$ and $Q_k$. As previously mentioned, the Kalman filter has numerous applications for guidance, navigation, and control of aerospace vehicles, e.g., aircraft, spacecraft, rockets, and missiles. However, the filter will be combined, as introduced herein, with smoothing and predicting with applications to (possibly real-time) predictive and prescriptive analytics.
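The text does not spell out Van Loan's construction. As a sketch, assume an underlying continuous-time model $\dot{x} = F x + G u$, where $u$ is white noise with power spectral density $W$ and the sampling interval is $\Delta t$; the symbols $F$, $G$, $W$, $\Delta t$, $A$, and $B$ are introduced here for illustration and do not appear in the original. The method forms a block matrix, takes its matrix exponential, and reads both quantities from the result:

$$
A = \begin{bmatrix} -F & G W G^T \\ 0 & F^T \end{bmatrix} \Delta t,
\qquad
B = e^{A} = \begin{bmatrix} B_{11} & B_{12} \\ 0 & B_{22} \end{bmatrix},
$$

$$
\phi_k = B_{22}^T, \qquad Q_k = \phi_k B_{12}.
$$

The lower-right block equals $\phi_k^T$ and the upper-right block equals $\phi_k^{-1} Q_k$, which is why a single matrix exponential yields both the state transition matrix and the process noise covariance.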

The second part of the three-part system is an implementation of discrete fixed-interval smoothing. The smoothing form of optimal estimation is when an estimate falls within a span of measurement points. For the proposed system, the time interval of the measurements is fixed (hence the name) and optimal estimates of the (saved) states $\hat{x}_k$ are obtained.

With initial conditions given by the last a posteriori estimate and covariance from the filter

$$\hat{x}_T^s = \hat{x}_T$$

$$P_T^s = P_T,$$

the smoother sweeps backward recursively

$$
\begin{aligned}
C_k &= P_k \phi_k^T \left(P_{k+1}^-\right)^{-1} && \text{Smoothing gain} \\
\hat{x}_k^s &= \hat{x}_k + C_k \left(\hat{x}_{k+1}^s - \phi_k \hat{x}_k\right) && \text{State smoothing estimate} \\
P_k^s &= P_k + C_k \left(P_{k+1}^s - P_{k+1}^-\right) C_k^T && \text{State smoothing covariance}
\end{aligned}
$$
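The backward sweep can be sketched as follows. This is an illustrative NumPy version that assumes a constant state transition matrix and that the filter has saved its a posteriori estimates and covariances along with the a priori covariances; the function name `rts_smooth` and the list-based storage are assumptions of this example.

```python
import numpy as np


def rts_smooth(x_post, P_post, P_prior, phi):
    """Fixed-interval backward sweep over saved filter outputs.

    x_post[k], P_post[k]: a posteriori estimate and covariance at step k.
    P_prior[k]: a priori covariance at step k, so P_prior[k + 1] is the
    covariance propagated from P_post[k].
    """
    last = len(x_post) - 1
    x_s = list(x_post)
    P_s = list(P_post)  # the final entries initialize the smoother
    for k in range(last - 1, -1, -1):
        # Smoothing gain
        C = P_post[k] @ phi.T @ np.linalg.inv(P_prior[k + 1])
        # State smoothing estimate
        x_s[k] = x_post[k] + C @ (x_s[k + 1] - phi @ x_post[k])
        # State smoothing covariance
        P_s[k] = P_post[k] + C @ (P_s[k + 1] - P_prior[k + 1]) @ C.T
    return x_s, P_s


# Scalar example with a constant state (phi = 1, no process noise).
phi = np.array([[1.0]])
x_s, P_s = rts_smooth(
    x_post=[np.array([1.0]), np.array([2.0])],
    P_post=[np.array([[0.5]]), np.array([[1.0 / 3.0]])],
    P_prior=[np.array([[1.0]]), np.array([[0.5]])],
    phi=phi,
)
# With no process noise the smoothed value at k = 0 equals the final
# filtered value: x_s[0] is [2.0] and P_s[0] is [[1/3]].
```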

The third part of the three-part system is an implementation of a predictor. The predicting form of optimal estimation is when an estimate falls beyond the last measurement point. The equations for the predictor are identical to the filter with the following three exceptions:

-   (1) The initial conditions are given by the last a posteriori estimate and covariance of the smoother:

$$\hat{x}_0^- = \hat{x}_T^s$$

$$P_0^- = P_T^s,$$

-   (2) The covariance of the measurement noise $R_k$ is set to a large value rendering the measurements worthless because there aren't any measurements available with prediction.
-   (3) As such, $z_k$ is fixed to the value of the last measurement.

The predictor propagates forward for the forecast period of interest. These predictions may be at various points in the future over various time periods. For example, one analytics application might predict temperature one minute into the future and/or five minutes into the future. Also, there may be temperature predictions at hourly or daily rates. These example combinations would be dependent on the application, of course.
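The three exceptions can be sketched by reusing the filter recursion with an inflated measurement covariance and a frozen measurement. This is an illustrative NumPy version; the function name `predict`, the default value `r_large=1e12`, and the choice of an identity-scaled covariance are assumptions of this example.

```python
import numpy as np


def predict(x0, P0, z_last, phi, H, Q, n_steps, r_large=1e12):
    """Forecast n_steps ahead by running the filter with worthless measurements."""
    R = r_large * np.eye(H.shape[0])   # exception (2): very large R
    x_prior, P_prior = x0, P0          # exception (1): start from the smoother
    forecasts = []
    for _ in range(n_steps):
        # The gain is nearly zero because R dominates the innovation covariance.
        K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
        # Exception (3): the measurement is frozen at its last value.
        x_post = x_prior + K @ (z_last - H @ x_prior)
        I_KH = np.eye(len(x0)) - K @ H
        P_post = I_KH @ P_prior @ I_KH.T + K @ R @ K.T
        x_prior = phi @ x_post
        P_prior = phi @ P_post @ phi.T + Q
        forecasts.append((x_prior, P_prior))
    return forecasts


# Constant-velocity example: position 0, velocity 2, one-period steps.
phi = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
forecasts = predict(np.array([0.0, 2.0]), np.eye(2), np.array([0.0]),
                    phi, H, np.zeros((2, 2)), n_steps=3)
positions = [x[0] for x, _ in forecasts]
```

In this example the forecast positions advance by about two units per period even though the frozen measurement stays at zero, which illustrates that the frozen value has essentially no influence once the covariance is large.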

Those skilled in the art know how to model the dynamic process and measurements described by $x_{k+1} = \phi_k x_k + w_k$ and $z_k = H_k x_k + v_k$, respectively. Thus, once initialized with $\hat{x}_0^-$ and $P_0^-$, the five-step Kalman filter iterates recursively until the set of data to be filtered is exhausted, resulting in state estimates $\hat{x}_k$ and state covariances $P_k$.

Upon saving the state estimates $\hat{x}_k$ and the state covariances $P_k$, and properly initializing $\hat{x}_T^s$ and $P_T^s$ with the last entries $\hat{x}_k(T_f)$ and $P_k(T_f)$, the three-step smoother iterates recursively with a backward sweep to an earlier time point, as illustrated in FIG. 2, showing fixed-interval smoothing, until all states/covariances are consumed. At this point, the system is prepared for predictive and prescriptive analytics.

The last state estimate and state covariance of the smoother are used to initialize the predictor. The predictor runs just like the five-step Kalman filter with two exceptions: (i) the covariance of the measurement noise $R_k$ is set to an arbitrarily large value to indicate the measurements are worthless, because there are not any, and (ii) the measurement $z_k$ is fixed to its final value, because that is the last piece of information available.
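The effect of inflating the measurement covariance can be made explicit. As a sketch, assume $R_k = \rho I$ with $\rho \to \infty$; the scalar $\rho$ is notation introduced here and is not part of the original text. The filter recursion then reduces as follows:

$$
\begin{aligned}
K_k &= P_k^- H_k^T \left(H_k P_k^- H_k^T + \rho I\right)^{-1} \to 0, \\
\hat{x}_k &= \hat{x}_k^- + K_k \left(z_k - H_k \hat{x}_k^-\right) \to \hat{x}_k^-, \\
P_k &= (I - K_k H_k) P_k^- (I - K_k H_k)^T + K_k (\rho I) K_k^T \to P_k^-, \\
\hat{x}_{k+1}^- &= \phi_k \hat{x}_k \to \phi_k \hat{x}_k^-, \\
P_{k+1}^- &= \phi_k P_k \phi_k^T + Q_k \to \phi_k P_k^- \phi_k^T + Q_k.
\end{aligned}
$$

The term $K_k (\rho I) K_k^T$ vanishes because the gain shrinks like $1/\rho$ while the covariance grows only like $\rho$. In this limit the value assigned to $z_k$ no longer affects the estimate, and the predictor amounts to propagating the state and covariance through the dynamic model.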

An application of the disclosed three-part system's implementation is shown in FIG. 3 illustrating an example of filtering, smoothing, and predicting for analytics where the future position of a moving object is predicted. In FIG. 3, the actual position of an object is depicted on the vertical axis with a dashed line, the filtered position is represented by dots, the smoothed position is denoted by a solid line, and the predicted position is shown with “x” marking the spots. Measurements are filtered and smoothed up until 50 seconds, at which time, predictions are made 30 seconds into the future.
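A scenario of this kind can be sketched end to end. The example below is hypothetical: the constant-velocity model, the noise levels, the initial conditions, and the synthetic data are all assumptions chosen for illustration and are not the data behind FIG. 3. The predictor is written here as a pure propagation, which is the limiting case of the large-covariance recipe.

```python
import numpy as np


def run_filter(zs, x0, P0, phi, H, Q, R):
    """Forward pass; saves what the smoother needs."""
    x_prior, P_prior = x0, P0
    xs, Ps, Pm = [], [], []
    for z in zs:
        K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
        x = x_prior + K @ (z - H @ x_prior)
        A = np.eye(len(x0)) - K @ H
        P = A @ P_prior @ A.T + K @ R @ K.T
        xs.append(x)
        Ps.append(P)
        Pm.append(P_prior)
        x_prior, P_prior = phi @ x, phi @ P @ phi.T + Q
    return xs, Ps, Pm


def run_smoother(xs, Ps, Pm, phi):
    """Backward sweep initialized with the last filter output."""
    xs_s, Ps_s = list(xs), list(Ps)
    for k in range(len(xs) - 2, -1, -1):
        C = Ps[k] @ phi.T @ np.linalg.inv(Pm[k + 1])
        xs_s[k] = xs[k] + C @ (xs_s[k + 1] - phi @ xs[k])
        Ps_s[k] = Ps[k] + C @ (Ps_s[k + 1] - Pm[k + 1]) @ C.T
    return xs_s, Ps_s


def run_predictor(x, P, phi, Q, n_steps):
    """Propagate beyond the last measurement (zero-gain limit of the filter)."""
    out = []
    for _ in range(n_steps):
        x, P = phi @ x, phi @ P @ phi.T + Q
        out.append((x, P))
    return out


# Hypothetical object moving at 0.5 units/s, measured once per second for 50 s.
rng = np.random.default_rng(0)
phi = np.array([[1.0, 1.0], [0.0, 1.0]])   # state: [position, velocity]
H = np.array([[1.0, 0.0]])                 # only position is measured
Q = 1e-8 * np.eye(2)
R = np.array([[1.0]])
t = np.arange(0.0, 51.0)
truth = 2.0 + 0.5 * t
zs = [np.array([p + rng.normal(0.0, 1.0)]) for p in truth]

xs, Ps, Pm = run_filter(zs, np.zeros(2), 100.0 * np.eye(2), phi, H, Q, R)
xs_s, Ps_s = run_smoother(xs, Ps, Pm, phi)
forecast = run_predictor(xs_s[-1], Ps_s[-1], phi, Q, n_steps=30)
predicted_position_80s = forecast[-1][0][0]   # true value in this setup: 42.0
```

The smoothed estimates replace the filtered ones over the measured interval, and the forecast starts from the final smoothed state, mirroring the order of operations described above.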

Referring again to FIG. 1, if performance of a three-part system is better than ETS, then its performance is better than the rest. In this section, details are presented showing filtering with prediction is better than ETS. When combined with smoothing, performance improves and forms the basis of the accurate prediction shown in FIG. 3.

Data used to perform the analysis was gathered from a live, operating Horizontal Pump System (HPS). Filtering and predicting were tested with eighteen data sets. Measurements were taken every hour so that one forecasting period represents one hour of elapsed time. The data measures various components of the HPS including bearing and winding temperatures in the motor, pump vibration and suction pressure, overall system health, and other attributes of the system.

Each data set includes noise which may vary with time. The different sources of data provide a mixture of different characteristics such as seasonality, trends, impulses, and randomness. For example, temperature data is affected by the day/night (diurnal) cycle which creates a (short) seasonal characteristic. Vibration data, however, is not affected by the day/night cycle and is not seasonal but does contain a significant portion of randomness.

A missed prediction occurs when an observed measurement exceeds a threshold value, but no forecast was produced which predicted the exception. Any forecast which predicted the exception within twelve periods leading up to the exception was not considered because such a short forecast is not useful. A prediction strategy should produce as few missed predictions as possible.
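The scoring rule can be sketched as follows. The data layout is hypothetical: each forecast is represented as a pair giving the period in which it was issued and the period in which it predicted an exceedance. Treating a lead of exactly twelve periods as usable, and requiring the predicted period to match the event exactly, are assumptions of this example; the text does not settle either point.

```python
def missed_prediction_rate(exceed_times, forecasts, min_lead=12):
    """Percent of threshold exceedances that no usable forecast anticipated.

    exceed_times: periods at which an observed measurement exceeded the threshold.
    forecasts: (issued_at, predicted_at) pairs for forecasts of an exceedance.
    min_lead: forecasts issued fewer than this many periods ahead are ignored.
    """
    if not exceed_times:
        return 0.0
    missed = 0
    for event in exceed_times:
        usable = any(
            predicted_at == event and event - issued_at >= min_lead
            for issued_at, predicted_at in forecasts
        )
        if not usable:
            missed += 1
    return 100.0 * missed / len(exceed_times)


# Three exceedances: one forecast 20 periods ahead, one only 5 periods
# ahead (too short to count), and one not forecast at all.
rate = missed_prediction_rate([30, 60, 90], [(10, 30), (55, 60)])
```

In this example two of the three events count as missed, so the rate is about 66.7 percent.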

Turning now to Table 2 (below), illustrated are results for temperature and vibration data after filtering and predicting, showing sensitivities, forecast lengths, and average percent of missed predictions.

TABLE 2

                                Forecast Length    Average % Missed Predictions
Strategy           Sensitivity  Periods (Days)     Low Noise    High Noise
ETS                High         24 (1)             18.92%       43.19%
Filter/Predictor   High         24 (1)             0.00%        0.00%
ETS                Medium       24 (1)             16.67%       42.94%
Filter/Predictor   Medium       24 (1)             0.00%        0.00%
ETS                Low          24 (1)             61.17%       62.05%
Filter/Predictor   Low          24 (1)             24.96%       13.89%
ETS                High         336 (14)           0.00%        5.56%
Filter/Predictor   High         336 (14)           0.00%        0.00%
ETS                Medium       336 (14)           0.00%        2.78%
Filter/Predictor   Medium       336 (14)           0.00%        0.00%
ETS                Low          336 (14)           0.00%        5.56%
Filter/Predictor   Low          336 (14)           0.00%        0.00%

Table 2 compares each strategy (ETS versus filter/prediction) over 24 periods (1 day) and 336 periods (14 days). The filter sensitivity column refers to how closely the signal is being tracked. For instance, temperature changes slowly over time so the filter/prediction combination is set to high sensitivity to track the slowly changing signal; whereas vibration, which contains high frequency noise, is set to low sensitivity, as illustrated in FIGS. 4A and 4B showing performance examples of temperature tracking and vibration filtering in a noisy measurement environment. The Average % Missed Predictions column in Table 2 shows the probability that, given the data has crossed a critical threshold, the corresponding strategy failed to predict the event. A lower percentage indicates better performance for this metric. In each scenario (low noise/high noise), the filter/predictor strategy with a high or medium sensitivity correctly predicted every event. Even in the case of low sensitivity, the filter/predictor strategy outperformed the ETS strategy.

The filtering, smoothing, and predicting process introduced herein outperforms the ETS strategy which was the basis of performance assessment over machine learning strategies as shown in Table 1. These results appear to be independent of sensitivity setting (low, medium, or high). Therefore, in general, a practitioner could use the filter/predictor strategy to avoid missed predictions. Furthermore, with the inclusion of smoothing, these results are improved upon as shown in FIG. 3.

Turning now to FIG. 5, illustrated is a flow diagram of an embodiment of a method 500 of estimating a state of a system. The method 500 may be employable to estimate a state of a system in a machine learning and/or noisy measurement environment. The method 500 is operable on a processor such as a microprocessor coupled to a memory, the memory containing instructions which, when executed by the processor, are operative to perform the functions. The method 500 begins at a start step or module 505.

At a step or module 510, a first estimate of a state of a system is constructed at a first time including a first covariance matrix describing an accuracy of the first estimate.

At a step or module 520, a second estimate of the state of said system is constructed at a second time, after the first time, including a second covariance matrix describing an accuracy of the second estimate employing a dynamic model of the state of the system; the dynamic model comprises a matrix with coefficients that describes a temporal evolution of the state of the system.

At a step or module 530, a value of a characteristic of the state of the system is measured at the second time. Measuring the value of the characteristic can include making a plurality of independent measurements characterized by a diagonal measurement covariance matrix. At a step or module 540, the second estimate of the state of the system and the second covariance matrix are adjusted based on the value of the characteristic.

At a step or module 550, a third estimate of the state of the system is constructed at a third time, before the second time, including a third covariance matrix describing an accuracy of the third estimate employing the dynamic model of the state of the system.

At a step or module 560, a fourth estimate of the state of the system is constructed at a fourth time, after the second time, from the second estimate. In some embodiments, the fourth time is on a different time scale from the first, second and third times.

At a step or module 570, the dynamic model is altered in response to the value of the characteristic.

At a step or module 580, the state of the system is reported based on the fourth estimate.

At a step or module 590, a fifth estimate of the state of the system is constructed at a fifth time, after the second time, from the second estimate.

In certain embodiments, the dynamic model is a linear dynamic model with constant coefficients. In an embodiment, constructing the first estimate and constructing the second estimate are performed by a Kalman filter.

At a step or module 595, the state of the system is altered based on the fourth estimate.

The method 500 terminates at end step or module 598.

The impacts to implementation of predictive analysis of processes introduced herein cannot be overstated. Whereas machine learning approaches are directly dependent on a large and fully populated training corpus, purely statistical approaches, such as ETS and the novel filter/predictor strategy introduced herein, learn directly from the real-time signal with additional data or knowledge imposed. Based upon the findings indicated in FIG. 1, the established ETS approach already performs better than the more widely used machine learning techniques. The improvements and advantages of the process introduced herein over ETS (shown in Table 2) only solidify the merits of the new approach.

In short, an advantage of the novel filtering, smoothing, and predicting process is that it does not require a priori knowledge, as machine learning techniques do. Because the system combines optimal estimation techniques of filtering, smoothing, and predicting, there are no dependencies on artificial neural nets and their (shallow, greedy, brittle, and opaque) shortcomings.

Turning now to FIG. 6, illustrated is a block diagram of an embodiment of an apparatus 600 for estimating the state of a system in a machine learning environment. The apparatus 600 is configured to perform functions described hereinabove of constructing the estimate of the state of the system. The apparatus 600 includes a processor (or processing circuitry) 610, a memory 620 and a communication interface 630 such as a graphical user interface.

The functionality of the apparatus 600 may be provided by the processor 610 executing instructions stored on a computer-readable medium, such as the memory 620 shown in FIG. 6. Alternative embodiments of the apparatus 600 may include additional components (such as the interfaces, devices and circuits) beyond those shown in FIG. 6 that may be responsible for providing certain aspects of the device's functionality, including any of the functionality to support the solution described herein.

The processor 610 (or processors), which may be implemented with one or a plurality of processing devices, performs functions associated with its operation including, without limitation, performing the operations of estimating the state of a system, computing covariance matrices, and estimating a future state of the system. The processor 610 may be of any type suitable to the local application environment, and may include one or more of general-purpose computers, special purpose computers, microprocessors, digital signal processors (“DSPs”), field-programmable gate arrays (“FPGAs”), application-specific integrated circuits (“ASICs”), and processors based on a multi-core processor architecture, as non-limiting examples.

The processor 610 may include, without limitation, application processing circuitry. In some embodiments, the application processing circuitry may be on separate chipsets. In alternative embodiments, part or all of the application processing circuitry may be combined into one chipset, and other application circuitry may be on a separate chipset. In still alternative embodiments, part or all of the application processing circuitry may be on the same chipset, and other application processing circuitry may be on a separate chipset. In yet other alternative embodiments, part or all of the application processing circuitry may be combined in the same chipset.

The memory 620 (or memories) may be one or more memories and of any type suitable to the local application environment, and may be implemented using any suitable volatile or nonvolatile data storage technology such as a semiconductor-based memory device, a magnetic memory device and system, an optical memory device and system, fixed memory and removable memory. The programs stored in the memory 620 may include program instructions or computer program code that, when executed by an associated processor, enable the respective apparatus 600 to perform its intended tasks. Of course, the memory 620 may form a data buffer for data transmitted to and from the same. Exemplary embodiments of the system, subsystems, and modules as described herein may be implemented, at least in part, by computer software executable by the processor 610, or by hardware, or by combinations thereof.

The communication interface 630 modulates information for transmission by the respective apparatus 600 to another apparatus. The respective communication interface 630 is also configured to receive information from another processor for further processing. The communication interface 630 can support duplex operation for the respective other processor 600.

In summary, the inventions disclosed herein combine three techniques of optimal estimation of the state of a system. The three techniques include filtering, smoothing, and predicting processes, and can be performed, without limitation, in a machine learning and/or a noisy measurement environment.

The filtering portion of optimal estimation is performed to construct a first estimate of a state vector x_(k) at a time point t_(k) that coincides with a measurement of a value of a characteristic of the state of the system at the time point t_(k). The filtering process employs a covariance matrix that describes the accuracy of the first estimate of the state vector x_(k) at the time point t_(k). A second estimate of the state vector x_(k+1) at the time point t_(k+1) is then constructed by propagating the state of the system forward to a second time point t_(k+1), the second time point being after the first time point. The propagating forward employs a dynamic model of the state of the system to produce the estimate of the state vector x_(k+1) at the second time point t_(k+1). Constructing the first estimate of the state vector x_(k) and constructing the second estimate of the state vector x_(k+1) can be performed by employing a Kalman filter.
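
By way of non-limiting illustration, the forward propagation described above can be sketched as the standard Kalman time update, in which the state is advanced by a transition matrix F and the covariance by F P Fᵀ + Q. The helper names (mat_mul, transpose, mat_add, propagate), the process-noise matrix Q, and the position/velocity example values are assumptions introduced for this sketch and do not appear in the disclosure.

```python
# Illustrative sketch only: helper names and numeric values are hypothetical.

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def mat_add(a, b):
    """Add two matrices of equal shape element by element."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def propagate(x, P, F, Q):
    """Time update: x_(k+1) = F x_(k), P_(k+1) = F P_(k) F^T + Q."""
    x_col = [[v] for v in x]                      # state as a column vector
    x_next = [row[0] for row in mat_mul(F, x_col)]
    P_next = mat_add(mat_mul(mat_mul(F, P), transpose(F)), Q)
    return x_next, P_next

# Assumed example: [position, velocity] state with a unit time step.
F = [[1.0, 1.0], [0.0, 1.0]]
Q = [[0.01, 0.0], [0.0, 0.01]]      # assumed process-noise covariance
x_k = [0.0, 1.0]                    # first estimate at t_(k)
P_k = [[1.0, 0.0], [0.0, 1.0]]      # first covariance matrix
x_pred, P_pred = propagate(x_k, P_k, F, Q)
```

In this sketch the propagated covariance is larger than the starting covariance, reflecting the reduced accuracy of an estimate that has not yet been corrected by a measurement.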

The dynamic model can employ a matrix with coefficients that describes temporal evolution of the state of the system. In certain embodiments, the dynamic model is a linear dynamic model with constant coefficients.
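
As a non-limiting example of a linear dynamic model with constant coefficients, the same matrix can be applied at every time step to evolve the state. The sketch below assumes a two-element [position, velocity] state and a hypothetical time step dt; the function names step and simulate are likewise introduced only for illustration.

```python
# Illustrative sketch only: the state layout, dt, and function names are assumptions.

def step(F, x):
    """Apply the constant-coefficient model once: x_(k+1) = F x_(k)."""
    return [sum(f * v for f, v in zip(row, x)) for row in F]

def simulate(F, x0, n_steps):
    """Return the list of states produced by applying F repeatedly."""
    states = [list(x0)]
    for _ in range(n_steps):
        states.append(step(F, states[-1]))
    return states

dt = 0.5                              # assumed time step
F = [[1.0, dt], [0.0, 1.0]]           # coefficients do not change with time
trajectory = simulate(F, [0.0, 2.0], 4)
positions = [state[0] for state in trajectory]
```

Because the coefficients are constant, the evolution over several steps is fully determined by the initial state and the single matrix F.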

A value of a characteristic of the state of the system x_(k+1) is measured at the second time point t_(k+1). The second estimate of the state of the system and the second covariance matrix are adjusted based on the measured value of the characteristic at the second time point t_(k+1).
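
The adjustment described above can be illustrated, without limitation, by the standard Kalman measurement update for a single scalar measurement z = h x + v, where v has variance r. The function name update_scalar, the measurement row h, and all numeric values are assumptions made for this sketch.

```python
# Illustrative sketch only: names and numeric values are hypothetical.

def update_scalar(x, P, h, z, r):
    """Adjust estimate x and covariance P using one scalar measurement z."""
    n = len(x)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]  # P h^T
    hP = [sum(h[i] * P[i][j] for i in range(n)) for j in range(n)]  # h P
    s = sum(h[i] * Ph[i] for i in range(n)) + r     # innovation variance
    K = [v / s for v in Ph]                         # Kalman gain
    innovation = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + K[i] * innovation for i in range(n)]
    # Covariance update P_new = (I - K h) P, written element by element.
    P_new = [[P[i][j] - K[i] * hP[j] for j in range(n)] for i in range(n)]
    return x_new, P_new

# Assumed second estimate and second covariance matrix before adjustment.
x_pred = [1.0, 1.0]
P_pred = [[2.0, 1.0], [1.0, 1.0]]
h = [1.0, 0.0]                        # only the first state element is measured
x_adj, P_adj = update_scalar(x_pred, P_pred, h, z=3.0, r=2.0)
```

In this example the adjusted covariance has smaller diagonal entries than the propagated covariance, and the unmeasured second state element is also corrected through its correlation with the first.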

Measuring the value of the characteristic can include making a plurality of independent measurements characterized by a diagonal measurement covariance matrix.
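
When the measurement covariance matrix is diagonal, one well-known option is to process the independent measurements one at a time as scalar updates, which avoids a matrix inversion. The disclosure does not state that this sequential form is used; it is shown here only as a sketch, and the names update_scalar and update_independent, together with the numeric values, are assumptions.

```python
# Illustrative sketch only: sequential processing is an assumed technique here.

def update_scalar(x, P, h, z, r):
    """Adjust estimate x and covariance P using one scalar measurement z."""
    n = len(x)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]  # P h^T
    hP = [sum(h[i] * P[i][j] for i in range(n)) for j in range(n)]  # h P
    s = sum(h[i] * Ph[i] for i in range(n)) + r     # innovation variance
    K = [v / s for v in Ph]                         # Kalman gain
    innovation = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + K[i] * innovation for i in range(n)]
    P_new = [[P[i][j] - K[i] * hP[j] for j in range(n)] for i in range(n)]
    return x_new, P_new

def update_independent(x, P, H, z, r_diag):
    """Apply each independent measurement in turn; r_diag is the diagonal of R."""
    for h_row, z_i, r_i in zip(H, z, r_diag):
        x, P = update_scalar(x, P, h_row, z_i, r_i)
    return x, P

# Assumed example: two independent sensors observe the same scalar state.
x0, P0 = [0.0], [[1.0]]
H = [[1.0], [1.0]]
z = [2.0, 4.0]
r_diag = [1.0, 1.0]
x_post, P_post = update_independent(x0, P0, H, z, r_diag)
```

For this example the result matches the combined estimate obtained by weighting the prior and both measurements by their inverse variances, namely a mean of 2 and a variance of 1/3.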

The smoothing portion of optimal estimation is performed by constructing a third state estimate for a time point that is earlier than the time point t_(k+1). The earlier time point can fall within or before a span of current measurement points, e.g., between or before the time points t_(k) and t_(k+1).
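
The disclosure does not name a particular smoothing algorithm. As one possible, non-limiting illustration, the sketch below uses a Rauch-Tung-Striebel (fixed-interval) smoother for a scalar model x_(k+1) = a x_(k) + w, z_(k) = x_(k) + v. The function names, the scalar model, and the sample measurements are assumptions made for this example.

```python
# Illustrative sketch only: the RTS smoother and the scalar model are assumed choices.

def kalman_filter(zs, a, q, r, x0, p0):
    """Forward pass; returns filtered and predicted (mean, variance) pairs."""
    filtered, predicted = [], []
    x, p = x0, p0
    for z in zs:
        predicted.append((x, p))          # estimate before using z
        k = p / (p + r)                   # scalar Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p
        filtered.append((x, p))           # estimate after using z
        x, p = a * x, a * a * p + q       # propagate to the next time point
    return filtered, predicted

def rts_smooth(filtered, predicted, a):
    """Backward pass; refines earlier estimates using later measurements."""
    smoothed = list(filtered)
    for k in range(len(filtered) - 2, -1, -1):
        x_f, p_f = filtered[k]
        x_pred, p_pred = predicted[k + 1]
        x_s, p_s = smoothed[k + 1]
        c = p_f * a / p_pred              # smoother gain
        smoothed[k] = (x_f + c * (x_s - x_pred),
                       p_f + c * c * (p_s - p_pred))
    return smoothed

zs = [1.0, 1.2, 0.9, 1.1]                 # assumed noisy measurements
filtered, predicted = kalman_filter(zs, a=1.0, q=0.01, r=0.25, x0=0.0, p0=1.0)
smoothed = rts_smooth(filtered, predicted, a=1.0)
```

In this sketch the smoothed variance at each earlier time point is no larger than the filtered variance, because the earlier estimate also benefits from measurements taken afterward.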

The predicting portion then propagates the state estimate forward for a forecast period of interest. The last state estimate and state covariance of the smoother can be used to initialize the predicting. The predictions may be at various time points in the future and over various time scales that are after the second time point. Measurement noise R_(k) can be set to an arbitrarily large value to accommodate the inherent absence of a state measurement at a future time point. The initial conditions for the prediction can be taken as the last a posteriori state estimate and the covariance of the smoother.
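
The effect of setting the measurement noise to an arbitrarily large value can be sketched, without limitation, for a scalar model: the Kalman gain becomes negligible, so the placeholder measurement is effectively ignored and the estimate simply follows the dynamic model while its variance grows. The constant LARGE_R, the function name forecast, the placeholder measurement value, and the numeric inputs are assumptions made for this example.

```python
# Illustrative sketch only: names and numeric values are hypothetical.

LARGE_R = 1e12   # assumed stand-in for an "arbitrarily large" measurement noise

def forecast(x, p, a, q, n_steps, r=LARGE_R, z_placeholder=0.0):
    """Run filter steps with no real measurement by making the gain negligible."""
    out = []
    for _ in range(n_steps):
        x, p = a * x, a * a * p + q       # propagate with the dynamic model
        k = p / (p + r)                   # gain is close to zero for large r
        x = x + k * (z_placeholder - x)   # placeholder has negligible effect
        p = (1.0 - k) * p
        out.append((x, p))
    return out

# Assumed initial conditions: last smoothed estimate and its variance.
predictions = forecast(x=1.0, p=0.1, a=1.0, q=0.01, n_steps=5)
variances = [p for _, p in predictions]
```

With these assumed values the forecast mean stays essentially at its initial value while the variance increases at every step, indicating growing uncertainty further into the forecast period.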

As described above, the exemplary embodiments provide both a method and corresponding apparatus consisting of various modules providing functionality for performing the steps of the method. The modules may be implemented as hardware (embodied in one or more chips including an integrated circuit such as an application specific integrated circuit), or may be implemented as software or firmware for execution by a processor. In particular, in the case of firmware or software, the exemplary embodiments can be provided as a computer program product including a computer readable storage medium embodying computer program code (i.e., software or firmware) thereon for execution by the computer processor. The computer readable storage medium may be non-transitory (e.g., magnetic disks; optical disks; read only memory; flash memory devices; phase-change memory) or transitory (e.g., electrical, optical, acoustical or other forms of propagated signals, such as carrier waves, infrared signals, digital signals, etc.). The coupling of a processor and other components is typically through one or more busses or bridges (also termed bus controllers). The storage device and signals carrying digital traffic respectively represent one or more non-transitory or transitory computer readable storage media. Thus, the storage device of a given electronic device typically stores code and/or data for execution on the set of one or more processors of that electronic device such as a controller.

Although the embodiments and their advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the spirit and scope thereof as defined by the appended claims. For example, many of the features and functions discussed above can be implemented in software, hardware, or firmware, or a combination thereof. Also, many of the features, functions, and steps of operating the same may be reordered, omitted, added, etc., and still fall within the broad scope of the various embodiments.

Moreover, the scope of the various embodiments is not intended to be limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized as well. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.

1. A method, comprising: constructing a first estimate of a state of a system at a first time including a first covariance matrix describing an accuracy of said first estimate; constructing a second estimate of said state of said system at a second time being after said first time including a second covariance matrix describing an accuracy of said second estimate employing a dynamic model of said state of said system; measuring a value of a characteristic of said state of said system at said second time; adjusting said second estimate of said state of said system and said second covariance matrix based on said value of said characteristic; constructing a third estimate of said state of said system at a third time being before said second time including a third covariance matrix describing an accuracy of said third estimate employing said dynamic model of said state of said system; and constructing a fourth estimate of said state of said system at a fourth time being after said second time from said second estimate.
2. The method recited in claim 1, further comprising altering said dynamic model in response to said value of said characteristic.
3. The method recited in claim 1, further comprising reporting said state of said system based on said fourth estimate.
4. The method recited in claim 1, further comprising constructing a fifth estimate of said state of said system at a fifth time being after said second time from said second estimate.
5. The method recited in claim 1, wherein said dynamic model is a linear dynamic model with constant coefficients.
6. The method recited in claim 1, wherein said constructing said first estimate and constructing said second estimate are performed by a Kalman filter.
7. The method recited in claim 1, further comprising altering said state of said system based on said fourth estimate.
8. The method recited in claim 1, wherein said measuring said value of said characteristic further comprising making a plurality of independent measurements characterized by a diagonal measurement covariance matrix.
9. The method recited in claim 1, wherein said dynamic model comprises a matrix with coefficients that describes a temporal evolution of said state of said system.
10. The method recited in claim 1, wherein said fourth time is on a different time scale from said first, second and third times.
11. An apparatus operable to construct the state of a system in a noisy measurement environment, comprising: processing circuitry coupled to a memory, configured to: construct a first estimate of a state of a system at a first time including a first covariance matrix describing an accuracy of said first estimate; construct a second estimate of said state of said system at a second time being after said first time including a second covariance matrix describing an accuracy of said second estimate employing a dynamic model of said state of said system; measure a value of a characteristic of said state of said system at said second time; adjust said second estimate of said state of said system and said second covariance matrix based on said value of said characteristic; construct a third estimate of said state of said system at a third time being before said second time including a third covariance matrix describing an accuracy of said third estimate employing said dynamic model of said state of said system; and construct a fourth estimate of said state of said system at a fourth time being after said second time from said second estimate.
12. The apparatus recited in claim 11, wherein said processing circuitry is further configured to alter said dynamic model in response to said value of said characteristic.
13. The apparatus recited in claim 11, wherein said processing circuitry is further configured to report said state of said system based on said fourth estimate.
14. The apparatus recited in claim 11 wherein said processing circuitry is further configured to construct a fifth estimate of said state of said system at a fifth time being after said second time from said second estimate.
15. The apparatus recited in claim 11 wherein said dynamic model is a linear dynamic model with constant coefficients.
16. The apparatus recited in claim 11 wherein said constructing said first estimate and constructing said second estimate are performed by a Kalman filter.
17. The apparatus recited in claim 11 wherein said processing circuitry is further configured to alter said state of said system based on said fourth estimate.
18. The apparatus recited in claim 11 wherein said measuring said value of said characteristic further comprises making a plurality of independent measurements characterized by a diagonal measurement covariance matrix.
19. The apparatus recited in claim 11 wherein said dynamic model comprises a matrix with coefficients that describes a temporal evolution of said state of said system.
20. The apparatus recited in claim 11 wherein said fourth time is on a different time scale from said first, second and third times.